Bitmap approach to trend clustering for prediction in time series databases
Abstract
This paper describes a bitmap approach to clustering and predicting trends in time-series databases. Similar trend patterns, rather than similar data patterns, are extracted from a time-series database. We consider four types of match: (1) exact match, (2) similarity match, (3) exact match by shift, and (4) similarity match by shift. A pair of time series is matched under one of these four types when the two series are similar to each other, that is, when their similarity (sim) exceeds a threshold. Matched series are clustered using the same matching procedure. To improve performance, we use the notion of the center of a cluster. The radius of a cluster is used to determine whether a given time series belongs to the cluster. We also use a new notion of dissimilarity, called dissim, to form accurate clusters. Using both notions, sim and dissim, makes it more likely that a time series is placed in one cluster rather than another: the series is similar to one cluster while it is dissimilar to another. For a trend sequence, a cluster that is dissimilar to that sequence is called a dissimilar-cluster. For a cluster, dissim can also be used to identify the set of clusters that are dissimilar to it. A trend can be predicted by (1) intra-cluster trend prediction, which refers to the trends in the cluster to which that trend belongs, and (2) inter-cluster trend prediction, which refers to the trends in a cluster that is dissimilar to that trend. The contributions of this paper are (1) clustering that uses not only similarity match but also dissimilarity match, which is intended to prevent false positives and false negatives; (2) prediction that uses not only similar trend sequences but also dissimilar trend sequences; and (3) a bitmap approach that can improve the performance of clustering and prediction.
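The abstract does not give the bitmap encoding or the formulas for sim and dissim, so the following Python sketch is only one plausible reading, not the paper's method. It assumes one bit per time step (1 when the value rises, 0 otherwise), takes sim as the fraction of positions on which two bitmaps agree and dissim as the fraction on which they differ, and tests cluster membership by comparing dissim to the cluster center against the radius. All function names, the shift direction, and the `threshold`, `max_shift`, and `radius` parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def trend_bits(series: List[float]) -> Tuple[int, int]:
    """Encode a series as (bitmap, n): bit i is 1 when series[i + 1] > series[i]."""
    n = len(series) - 1
    bits = 0
    for i in range(n):
        if series[i + 1] > series[i]:
            bits |= 1 << i
    return bits, n


def differing(a: int, b: int, n: int) -> int:
    """Count positions among the lowest n bits where a and b disagree (XOR, then popcount)."""
    mask = (1 << n) - 1
    return bin((a ^ b) & mask).count("1")


def sim(a: int, b: int, n: int) -> float:
    """Assumed definition: fraction of the n trend positions on which a and b agree."""
    return (n - differing(a, b, n)) / n


def dissim(a: int, b: int, n: int) -> float:
    """Assumed definition: fraction of the n trend positions on which a and b differ."""
    return differing(a, b, n) / n


def best_shift(a: int, b: int, n: int, max_shift: int) -> Tuple[float, int]:
    """Best sim between a shifted by 1..max_shift steps and b, measured on the overlap."""
    best_score, best_s = 0.0, 0
    for s in range(1, max_shift + 1):
        overlap = n - s
        if overlap <= 0:
            break
        score = sim(a >> s, b, overlap)
        if score > best_score:
            best_score, best_s = score, s
    return best_score, best_s


def match_type(a: int, b: int, n: int, threshold: float, max_shift: int) -> Optional[str]:
    """Classify a pair into one of the four match types, or None when nothing matches."""
    if differing(a, b, n) == 0:
        return "exact match"
    if sim(a, b, n) >= threshold:
        return "similarity match"
    score, _ = best_shift(a, b, n, max_shift)
    if score == 1.0:
        return "exact match by shift"
    if score >= threshold:
        return "similarity match by shift"
    return None


def in_cluster(center: int, bits: int, n: int, radius: float) -> bool:
    """A series belongs to a cluster when its dissim to the center is within the radius."""
    return dissim(center, bits, n) <= radius


# Two series with different values but the same up/down pattern share one bitmap.
x, n = trend_bits([1, 2, 3, 2, 1, 2])
y, _ = trend_bits([5, 6, 7, 6, 5, 6])
w, _ = trend_bits([3, 1, 2, 3, 2, 1])
print(match_type(x, y, n, threshold=0.8, max_shift=2))
print(match_type(w, x, n, threshold=0.8, max_shift=2))
```

Packing the trends into integers means each comparison is a single XOR followed by a bit count, which is the kind of saving a bitmap representation is usually chosen for. This sketch only shifts the first series in one direction; a fuller version would try both directions.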
Similar resources
Fuzzy clustering of time series data: A particle swarm optimization approach
With the rapid development of information-gathering technologies and access to large amounts of data, we constantly need methods for analyzing data and extracting useful information from large raw datasets, and data mining is an important method for solving this problem. Clustering analysis, as the most commonly used function of data mining, has attracted many researchers in computer science. Because o...
Stock Trend Analysis and Trading Strategy
This paper outlines a data mining approach to analysis and prediction of the trend of stock prices. The approach consists of three steps, namely partitioning, analysis and prediction. A modification of the commonly used k-means clustering algorithm is used to partition stock price time series data. After data partition, linear regression is used to analyse the trend within each cluster. The res...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information-gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...
Risk prediction based on a time series case study: Tazareh coal mine
In this work, time series modeling was used to predict the Tazareh coal mine risks. For this purpose, a monthly analysis of the risk constituents, including the frequency index and the incidence severity index, was performed first. Next, a monthly time series diagram for each of these indices was plotted for the nine-year period from 2005 to 2013. After extrusion of the trend, seasonali...
Combination of Transformed-means Clustering and Neural Networks for Short-Term Solar Radiation Forecasting
In order to provide an efficient conversion and utilization of solar power, solar radiation data should be measured continuously and accurately over the long-term period. However, the measurement of solar radiation is not available to all countries in the world due to some technical and fiscal limitations. Hence, several studies were proposed in the literature to find mathematical and physical mod...